Contractivity of Bellman operator in risk averse dynamic programming with infinite horizon

Authors

Abstract

The paper deals with a risk-averse dynamic programming problem with infinite horizon. First, the assumptions required for the problem to be well defined are formulated. Then the Bellman equation is derived, which may also be seen as a standalone reinforcement learning problem. The fact that the Bellman operator is a contraction is proved, guaranteeing the convergence of various solution algorithms used for such problems, which we demonstrate on the value iteration and policy iteration algorithms.
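To make the contraction claim concrete, here is a minimal Python sketch of risk-averse value iteration, assuming a finite MDP with Conditional Value-at-Risk (CVaR) as the one-step risk measure; the paper's exact risk mapping and model data are not given above, so the MDP data (P, C), the discount factor gamma, the tail level alpha, and all function names are illustrative assumptions. Because CVaR is monotone and translation-equivariant, the operator below is a gamma-contraction in the sup norm, so the iteration converges geometrically:

    import numpy as np

    rng = np.random.default_rng(0)
    nS, nA = 5, 3            # hypothetical small state/action spaces
    gamma, alpha = 0.9, 0.2  # discount factor, CVaR tail level
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # P[s, a] = next-state distribution
    C = rng.uniform(0.0, 1.0, size=(nS, nA))       # one-step costs

    def cvar(values, probs, level):
        # CVaR_level of a discrete cost distribution: mean of the worst `level`-tail.
        order = np.argsort(values)[::-1]                # worst (largest) costs first
        sv, cp = values[order], np.cumsum(probs[order])
        prev = np.concatenate(([0.0], cp[:-1]))
        w = np.minimum(cp, level) - np.minimum(prev, level)  # tail weights, sum = level
        return float(w @ sv) / level

    def bellman(v):
        # risk-averse Bellman operator: (Tv)(s) = min_a [C(s,a) + gamma * CVaR_alpha(v(S') | s,a)]
        q = np.array([[C[s, a] + gamma * cvar(v, P[s, a], alpha)
                       for a in range(nA)] for s in range(nS)])
        return q.min(axis=1)

    v = np.zeros(nS)
    for k in range(500):
        v_new = bellman(v)
        gap = np.max(np.abs(v_new - v))  # sup-norm step; shrinks by a factor <= gamma
        v = v_new
        if gap < 1e-12:
            break
    print(f"value iteration converged after {k + 1} sweeps (last sup-norm change {gap:.1e})")

Policy iteration enjoys the same guarantee in this setting: evaluating a fixed policy applies the same risk mapping without the minimum over actions, and the contraction property makes each evaluation step well posed.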


Similar articles

Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures

In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk mea...
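The blurb above does not define the QBRM class precisely; as a small hedged illustration, a single quantile (a Value-at-Risk-type functional) of a sampled cost distribution can serve as a one-step risk measure and can be estimated as follows. The sample model and the name quantile_risk are assumptions made for this example only:

    import numpy as np

    def quantile_risk(cost_samples, tau):
        # tau-quantile of sampled costs (a VaR_tau-style functional);
        # a QBRM may combine several such quantiles of the cost distribution.
        return float(np.quantile(cost_samples, tau))

    rng = np.random.default_rng(1)
    stage_costs = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)  # hypothetical cost samples
    print(quantile_risk(stage_costs, 0.95))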


Dynamic linear programming games with risk-averse players

Motivated by situations in which independent agents, or players, wish to cooperate in some uncertain endeavor over time, we study dynamic linear programming games, which generalize classical linear production games to multi-period settings under uncertainty. We specifically consider that players may have risk-averse attitudes towards uncertainty, and model this risk aversion using coherent cond...


Risk neutral and risk averse Stochastic Dual Dynamic Programming method

In this paper we discuss risk neutral and risk averse approaches to multistage (linear) stochastic programming problems based on the Stochastic Dual Dynamic Programming (SDDP) method. We give a general description of the algorithm and present computational studies related to planning of the Brazilian interconnected power system.


Interchangeability principle and dynamic equations in risk averse stochastic programming

In this paper we consider interchangeability of the minimization operator with monotone risk functionals. In particular we discuss the role of strict monotonicity of the risk functionals. We also discuss implications to solutions of dynamic programming equations of risk averse multistage stochastic programming problems.
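For reference, the interchangeability principle alluded to above is commonly stated as follows; this is a standard textbook form, not a quotation from the paper, and regularity conditions such as the existence of measurable selections are omitted:

    % interchangeability of minimization with a monotone risk functional \rho
    \[
      \rho\Bigl[\,\inf_{x \in X} f(x,\xi)\,\Bigr]
      \;=\;
      \inf_{\bar{x}(\cdot)} \rho\bigl[f(\bar{x}(\xi),\xi)\bigr],
    \]
    % the infimum on the right runs over measurable selections \bar{x}(\cdot);
    % monotonicity of \rho gives "<=" immediately, since
    % \inf_{x} f(x,\xi) <= f(\bar{x}(\xi),\xi) pointwise, while relating
    % minimizers of the two sides is where strict monotonicity enters.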


Stabilizing Policy Improvement for Large-Scale Infinite-Horizon Dynamic Programming

Today’s focus on sustainability within industry presents a modeling challenge that may be dealt with using dynamic programming over an infinite time horizon. However, the curse of dimensionality often results in a large number of states in these models. These large-scale models require numerically stable solution methods. The best method for infinite-horizon dynamic programming depends on both ...



Journal

Journal: Operations Research Letters

Year: 2023

ISSN: 0167-6377, 1872-7468

DOI: https://doi.org/10.1016/j.orl.2023.01.008